Teachers are important, and policies mandating high-stakes evaluations of teachers are at
the forefront of popular school reforms. Today’s dominant approach labels teachers as 
effective or ineffective based in large part on a statistical analysis of students’ test-score
performance. Teachers judged effective are rewarded, and those found ineffective are 
sanctioned. 

While such summative evaluations can be useful, lawmakers should be wary of approaches 
based in large part on test scores: the error in the measurements is large, which results in
many teachers being incorrectly labeled as effective or ineffective; 1 relevant test scores are
not available for the students taught by most teachers, given that only certain grade levels
and subject areas are tested; and the incentives created by high-stakes use of test scores
drive undesirable teaching practices such as curriculum narrowing and teaching to the
test. 2

Summative initiatives should also be balanced with formative approaches, which identify 
strengths and weaknesses of teachers and directly focus on developing and improving their 
teaching. Measures that de-emphasize test scores are more labor intensive but have far
greater potential to enrich instruction and improve education. 

Teacher quality is among the most important within-school factors affecting student 
achievement. However, research also suggests that teacher differences account for no more 
than about 15% of differences in students’ test score outcomes. 3 Other school factors such 
as class size reduction and adequate, focused funding 3 are also research-based ways to 
improve education. Further, non-school factors, which are generally associated with
parental education and wealth, are far more important determinants of students’ test 
scores. 6 

Care must be taken in selecting or designing a balanced evaluation system. Given the 
extensive range of activities, skills, and knowledge involved in teachers’ daily work, the 
system’s goals must be clear and explicit and must reflect practitioner involvement. 7 Effective
teacher evaluation also requires an investment in sufficient numbers of qualified 
evaluators. Otherwise, the system will likely be irregular, uneven and ineffective. 8 

Many established evaluation systems are available, and some have a strong research base. 
Among the more widely known approaches are Charlotte Danielson’s Framework for 
Teaching 9 and the Peer Assistance and Review (PAR) 10 approach. Connecticut’s Beginning
Educator Support and Training (BEST) system and the National Board for
Professional Teaching Standards’ system for advanced teachers are also recognized as
promising systems for promoting both student learning and professional improvement. 11
Properly preparing teachers is also receiving renewed attention, and Stanford’s edTPA 
consortium of 24 states is developing comprehensive assessments of prospective 
teachers. 12 

Any single measure of teaching or teachers will emphasize one important element at the 
expense of others. 13 Accordingly, all teacher evaluation systems should employ a diverse 
set of measures to capture the complex nature of the art and science of teaching. 14 In fact, 
the wisest choice may be to have two or more separate measurement systems within a 
district, allowing for the possibility of different results, which in turn would provide a
check and a caution against relying on only one measurement system. 

Key Research Points and Advice for Policymakers 

• If the objective is improving educational practice, formative evaluations that guide 
a teacher’s improvement provide greater benefits than summative evaluations. 15 

• If the objective is to improve educational performance, outside-school factors must 
also be addressed. Teacher evaluation cannot replace or compensate for these much 
stronger determinants of student learning. 16 The importance of these outside-school 
factors should also caution against policies that simplistically attribute student test 
scores to teachers. 

• The results produced by value-added (test-score growth) models alone are highly 
unstable. They vary from year to year, from classroom to classroom, and from one
test to another. 17 Substantial reliance on these models can lead to practical, ethical
and legal problems. 

• High-stakes evaluations based in substantial part on students’ test scores narrow 
the curriculum by diminishing or pushing out non-tested subjects, knowledge, and 
skills. 18 

• Teacher evaluation systems necessarily involve trade-offs, and specific design 
choices are controversial, so it is important to involve all key stakeholders in system 
design or selection. 19 

• To be successful, schools must invest in their teacher evaluation systems. An 
adequate number of highly trained evaluators must be available. 20 

• Given the wide variety of teacher roles and the many factors that influence learning 
that are outside the control of the teacher, a wide variety of measures of teacher 
effectiveness is also indicated. 21 With diverse measures, the weakness of any single measure
is offset by the strengths of another. 22

• High-quality research on existing evaluative programs and tools should inform the 
design of teacher evaluation systems. 23 States and districts should investigate 
balanced models such as PAR and the Danielson Framework, closely examine the 
evidence concerning strengths and weaknesses of each model, and never attach to
teachers high-stakes consequences that the evidence cannot validly support.